Make CREATE SCHEMA migration DDL concurrent-safe against pg_namespace race#282
Merged
jeremydmiller merged 1 commit intoMay 18, 2026
Merged
Conversation
The migrator's WriteSql block emitted "IF NOT EXISTS(information_schema.schemata) THEN EXECUTE 'CREATE SCHEMA …'", which is not atomic against concurrent sessions: two sessions can both pass the existence check, then race on the insert into pg_namespace and one loses with "23505 duplicate key value violates unique constraint pg_namespace_nspname_index". PostgreSQL's own "CREATE SCHEMA IF NOT EXISTS" has the same window. Wrap the create in a plpgsql sub-block that catches duplicate_schema (42P06) and unique_violation (23505) specifically, so the losing session treats the race as benign; any other error still propagates. Tracked downstream as Marten flake: JasperFx/marten#4445.
jeremydmiller
added a commit
to JasperFx/marten
that referenced
this pull request
May 18, 2026
…tgresql Weasel 9.0.0-alpha.5 ships in lockstep (Weasel.Postgresql ⇒ this PR's existing bump, Weasel.EntityFrameworkCore ⇒ this commit). Picks up the CREATE SCHEMA race fix from JasperFx/weasel#282 needed by the un-skipped full_text_index tests on this branch.
jeremydmiller
added a commit
to JasperFx/marten
that referenced
this pull request
May 18, 2026
…ma (#4446) * [#4445] Un-skip full_text_index race-flake tests; isolate shared schema Pairs with weasel#282, which catches the duplicate_schema / unique_violation that two concurrent sessions race on when both emit "CREATE SCHEMA <name>". With that DDL fix in Weasel.Postgresql 9.0.0-alpha.5, the FTI fixture passes deterministically as a suite (was 3+ tests failing intermittently with "23505 pg_namespace_nspname_index"). The "fulltext"-schema test additionally has a data-accumulation problem independent of the race: DocumentStore.For() with a custom DatabaseSchemaName bypasses OneOffConfigurationsContext's cleanAll step (which only drops the fixture's own schema). Drop the "fulltext" schema in a pre-flight outside the #region so the docs sample stays clean. * Bump Weasel.EntityFrameworkCore to 9.0.0-alpha.5 alongside Weasel.Postgresql Weasel 9.0.0-alpha.5 ships in lockstep (Weasel.Postgresql ⇒ this PR's existing bump, Weasel.EntityFrameworkCore ⇒ this commit). Picks up the CREATE SCHEMA race fix from JasperFx/weasel#282 needed by the un-skipped full_text_index tests on this branch.
jeremydmiller
added a commit
that referenced
this pull request
May 24, 2026
closes #293) (#294) Follow-up to #282. That fix wrapped only CREATE SCHEMA in a plpgsql block catching 42P06/23505. The rest of the migration DDL emitted by migration.WriteAllUpdates(...) — CREATE OR REPLACE FUNCTION mt_immutable_*, CREATE TABLE IF NOT EXISTS, etc. — stayed unguarded, so concurrent lazy EnsureStorageExists against the same database could fail with "XX000: tuple concurrently updated" (two backends updating the same pg_proc / catalog row at once). Seen as a Marten conjoined multi-tenant CI flake (JasperFx/marten#4552). The DDL Weasel emits is idempotent (IF NOT EXISTS / CREATE OR REPLACE) and executeDelta runs each statement on the bare connection (autocommit, no ambient transaction), so a statement that lost a catalog race is safe to re-run from a clean slate. Wrap each cmd.ExecuteNonQueryAsync in a bounded retry (3 attempts, jittered sub-100ms backoff) keyed on the transient SQLSTATEs: - 40001 serialization_failure - 40P01 deadlock_detected - XX000 internal_error ONLY when the message is "tuple concurrently updated" (XX000 is a catch-all, so the message guard avoids blanket-retrying unrelated internal errors) The retry sits inside the existing try/catch, so after exhausting attempts the established logger.OnFailure / rethrow path is unchanged. Kept PG-specific (these SQLSTATEs are PostgreSQL's; executeDelta is a per-provider override). The classification is extracted as a pure internal predicate IsTransientCatalogConcurrency(sqlState, messageText) and covered by 12 unit assertions (the race itself isn't deterministically testable). Happy-path migration/schema integration tests pass unchanged (21/22, 1 pre-existing skip). Downstream Marten#4552 stabilizes once it consumes a Weasel build with this. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
PostgresqlMigrator.WriteSqlemittedIF NOT EXISTS(information_schema.schemata) THEN EXECUTE 'CREATE SCHEMA …', which is not atomic against concurrent sessions: both sessions can pass the existence check, then race on the insert intopg_namespaceand one loses with23505 duplicate key value violates unique constraint pg_namespace_nspname_index. PostgreSQL's ownCREATE SCHEMA IF NOT EXISTShas the same window (see e.g. pg-hackers thread).duplicate_schema(42P06) andunique_violation(23505) specifically, so the losing session treats the race as benign. Any other error still propagates.full_text_indextests fail intermittently against the existing master with the exact 23505 symptom.Why this minimal form
The published DDL is wrapped in an outer
DO $$ BEGIN … END $$;(one block per migration); each schema gets a nestedBEGIN … EXCEPTION … END;sub-block so the EXCEPTION clause only suppresses the race for the currentCREATE SCHEMA, not for the rest of the migration's DDL.Test plan
dotnet build src/Weasel.Postgresql/Weasel.Postgresql.csprojsucceedsDocumentDbTests.Indexes.full_text_index(31 tests, 5 currently[Fact(Skip = "…#4445")]) — unskipping all 5 and running 3× in a row withDISABLE_TEST_PARALLELIZATION=trueis now31/31 passeddeterministically (wasFailed: 3+/31intermittently)Weasel.Postgresql.Testssuite (the Marten-side validation is the strongest signal because Weasel has no test asserting on the DDL emission text)🤖 Generated with Claude Code